WiFlix: Adaptive Video Streaming in Massive 
MU-MIMO Wireless Networks 


D. Bethanabhotla, G. Caire and M. J. Neely 

Abstract 

We consider the problem of simultaneous on-demand streaming of stored video to multiple users in 
a multi-cell wireless network where multiple unicast streaming sessions are run in parallel and share the 
same frequency band. Each streaming session is formed by the sequential transmission of video “chunks”, 
such that each chunk arrives into the corresponding user playback buffer within its playback deadline. 

We formulate the problem as a Network Utility Maximization (NUM) where the objective is to fairly 
maximize users’ video streaming Quality of Experience (QoE) and then derive an iterative control policy 
using Lyapunov Optimization, which solves the NUM problem up to any level of accuracy and yields an 
online protocol with control actions at every iteration decomposing into two layers interconnected by the 
users’ request queues: i) a video streaming adaptation layer reminiscent of DASH, implemented at each 
user node; ii) a transmission scheduling layer where a max-weight scheduler is implemented at each base 
station. The proposed chunk request scheme is a pull strategy where every user opportunistically requests 
video chunks from the neighboring base stations and dynamically adapts the quality of its requests based 
on the current size of the request queue. For the transmission scheduling component, we first describe 
the general max-weight scheduler and then particularize it to a wireless network where the base stations 
have multiuser MIMO (MU-MIMO) beamforming capabilities. We exploit the channel hardening effect of 
large-dimensional MIMO channels (massive MIMO) and devise a low complexity user selection scheme 
to solve the underlying combinatorial problem of selecting user subsets for downlink beamforming, which 
can be easily implemented and run independently at each base station. Further, through simulations, we 
show that deploying MU-MIMO significantly improves video streaming performance and also that the 
proposed cross-layer approach is able to serve users more fairly than a baseline scheme representative 
of current systems running independently designed protocol layers. 
I. Introduction 


Demand for video content over wireless networks has grown dramatically in recent years and is 
predicted to account for 69% of the total mobile data traffic by 2018 [1]. This is mainly due to 
on-demand video streaming, enabled by multimedia devices such as tablets and smartphones. In addition, 
recent measurement studies [2] reveal that, in 2013, around 26.9% of video streaming sessions on the 
Internet experienced playback interruption due to re-buffering, 43.3% were impacted by low resolution, 
and 4.8% failed to start altogether. At the application layer, Dynamic Adaptive Streaming over HTTP 
(DASH) [3], [4] has become a de-facto industry standard approach to handle video streaming over 
wireless networks. In DASH, each user (client) monitors the available capacity during a video streaming 
session and adaptively and dynamically chooses the most appropriate video quality level. 
The video files are divided into “chunks”, which are downloaded by sequential HTTP requests. Different 
quality levels can be obtained either by storing multiple versions of the same video encoded at different 
bit-rates, or by using scalable video coding and sending an adaptive number of refinement layers [5]. 

In this way, DASH attempts to maintain a reasonable quality of experience (QoE) even under changing 
network conditions. However, operating at the application layer only is not sufficient to achieve a fully 
satisfactory performance. For instance, popular video platforms such as YouTube and Netflix, which 
employ DASH at the application layer, have realized this fact and recently released Video Quality 
Reports [6], [7], in which they compare network service providers (ISPs) in a given 
geographical area and label them as Lower Definition (LD), Standard Definition (SD), or 
High Definition (HD) based on the quality of video streaming activity in their network over a certain 
time frame, in order to inform users that the choice of ISP can affect video streaming QoE. 


A. Motivation and related work 

In order to cope with this problem, a cross-layer optimization approach has been proposed in several 
works (e.g., see [8]-[14]). In these works, the video streaming QoE is defined in terms of performance 
metrics such as video quality, probability of stall events (i.e., when the playback buffer is empty and 
video playback stops), pre-buffering time, and re-buffering time. However, the joint optimization of these 
metrics by directly controlling the dynamics of the playback buffers of all the users in the network 
requires solving a Markov Decision Problem (MDP), which is typically quite difficult and incurs the 
well-known curse of dimensionality. For instance, one such work considers adaptive video transmission in the much 
simpler setting of a point-to-point wireless link and formulates the problem as an MDP, which is then 
solved using the value iteration algorithm. However, even in such a simple point-to-point scenario, the 
value iteration policy requires extensive computation to be done offline and stored in a lookup table, 
which is then used for the actual transmission. On the other hand, the work [13] takes a cross-layer 
approach and considers video delivery in the general case of a multiuser wireless network where users 
are served by wireless helpers. In order to obtain a tractable formulation for the multiuser network, 
[13] adopts a “divide and conquer” approach where first the problem of maximizing a function of the 
time-averaged video qualities, subject to queue stability, is solved, and then the delay jitter is taken care 
of by appropriately dimensioning the pre-buffering and re-buffering times, exploiting the fact that the 
playback buffer can absorb the delay fluctuations around the (bounded) mean. However, in [13] a “push” 
scheduling policy is considered, for which video chunks can be served out of order and data may be 
lost in the presence of intermittent connectivity and/or mobility. In this paper, we fix this problem and 
introduce a new “pull” strategy that is robust to fast topology variations. Our scheme allows each user 
to opportunistically pull data, always in the correct sequential order, from neighboring helper nodes. This 
results in smoother and more reliable performance. Another shortcoming of [13] is that it considers only 
helpers operating according to OFDM/TDMA, i.e., serving at most one user per transmission resource 
(referred to as PHY frame hereafter). As a matter of fact, current wireless technology is rapidly 
evolving towards multiuser MIMO (MU-MIMO) schemes (e.g., see [15]-[18]), where multiple users can 
be served on the same PHY frame by spatial multiplexing. The current work therefore allows for general 
wireless channel models, including MU-MIMO as a special case. 

¹This includes industry products such as Microsoft Smooth Streaming and Apple HTTP Live Streaming, which qualitatively 
work in the way assumed in our paper, up to minor variations that are irrelevant for the present theoretical treatment. 

²Our treatment applies, at a very high level, to any infrastructure-based wireless network such as conventional cellular, small 
cells, WLAN, and heterogeneous compositions thereof (e.g., a cellular network with WiFi off-load). Therefore, throughout this 
paper, we refer to infrastructure nodes simply as “helpers”. 

B. Contributions 

Motivated by the above considerations, this paper focuses on the problem of dynamic adaptive video 
streaming in a wireless network formed by a number of densely deployed wireless helper nodes serving 
multiple wireless users over a given geographic coverage area and on the same shared channel bandwidth. 
We address the problem by jointly optimizing the video quality adaptation at the DASH layer (application 
layer) and the transmission scheduling of users at the PHY/MAC layer. This is obtained through a 
cross-layer approach where the appropriate queue sizes maintained at the users act as a bridge between the 
layers. In particular, the novel contributions of this paper are as follows: 

• We introduce the notion of a request queue. This is a virtual queue, maintained by each user, that 
serves to sequentially request video chunks from helper nodes, such that the choice of the helper 
node and the quality at which each video chunk is requested can be adaptively adjusted. Each user, 
upon deciding the quality of the chunk, requests the bits corresponding to that chunk and places 
them in the request queue. Note that this does not mean the user has already downloaded the chunk, 
but the chunk bits are “virtually” placed in the request queue and will be taken out when the chunk 
is effectively delivered to the user. In this way, the user maintains in the request queue all the chunk 
bits that have been requested but not downloaded and adaptively adjusts the quality of future chunk 
requests based on its size. In addition, the user broadcasts this size to the helpers in its current 
vicinity and “pulls” bits from them in the right order necessary for video playback starting at the 
Head Of Line (HOL) of the request queue. Even if a mobile user gets out of range of a helper 
while downloading the HOL bits, it can still re-request those bits from the new helper in its current 
vicinity. In this way, the user always downloads chunks in the playback order and does not skip any 
of them. This improves significantly upon the “push” scheme proposed in [13], where the chunks 
could be downloaded out of order due to different transmission queue delays at different helper 
nodes, or skipped if a user moves out of a helper’s coverage after placing a request. 

• We systematically obtain our cross-layer policy as the dynamic solution of a Network Utility 
Maximization (NUM) problem, where the network utility function is given in terms of the users’ 
time-averaged video quality, and the maximization constraints are given by imposing stability of 
each request queue. The stability constraint implies that every requested chunk will be eventually 
delivered, while delivery in the right sequential order is guaranteed by the request queue mechanism 
described above. The proposed policy decomposes naturally into two interconnected layers: i) a video 
streaming adaptation layer reminiscent of DASH, implemented at each user node, and involving the 
adaptive video quality selection and placement of the video chunk requests into the request queue; 
ii) a transmission scheduling layer where a max-weight scheduler is implemented at the helpers. 
These two layers are interconnected by the users’ request queues, which form the weights for the 
max-weight scheduler. Although queue stability guarantees that all requested chunks are eventually 
delivered, such delivery may still occur, occasionally, after the corresponding playback deadline. In 
this case, a stall event occurs. In order to control the stall event probability and 
make it sufficiently small, we follow the same divide and conquer approach of [13] and adaptively 
set the pre-buffering/re-buffering time by monitoring the chunk delivery delay in a sliding window. 
This approach has the advantage of yielding very good performance also in terms of stall event 
probability, while allowing for the elegant and mathematically tractable NUM framework in terms 
of the video quality maximization. 

• We particularize the max-weight transmission policy to a network of helpers with MU-MIMO 
capabilities, where the scheduling actions consist of choosing the subset of users for MU-MIMO 
beamforming at each helper. By exploiting the “channel hardening” effect of large-dimensional 
MIMO channels (massive MIMO) [19]-[21], we reduce the combinatorial weighted sum rate maximization 
over the multiuser multicell network (which would involve an exponentially complex 
exhaustive user selection, or some polynomial complexity heuristic greedy user selection at each 
helper) to a simple subset selection problem which is optimally solved by a low complexity algorithm. 
The algorithm can be implemented independently at the MAC layer of each helper. The only 
information that needs to be exchanged between the layers is the length of the users’ request queues, 
which can be easily gathered as “protocol information” via the uplink, together with the chunk 
requests. 

• We show through simulation in a realistic network topology and using actual encoded video data 
that the proposed system is very effective in improving the average video quality and reducing the 
percentage of time spent in buffering mode. 

II. System Model 

We consider a wireless network with multiple users and multiple helper stations sharing the same 
bandwidth. The network is defined by a bipartite graph $\mathcal{G} = (\mathcal{U}, \mathcal{H}, \mathcal{E})$, where $\mathcal{U}$ denotes the set of 
users, $\mathcal{H}$ denotes the set of helpers, and $\mathcal{E}$ contains edges for all pairs $(h, u)$ such that helper $h$ can 
transmit information to user $u$. We denote by $\mathcal{N}(u) \subseteq \mathcal{H}$ the neighborhood of user $u$, i.e., $\mathcal{N}(u) = 
\{h \in \mathcal{H} : (h, u) \in \mathcal{E}\}$. Similarly, $\mathcal{N}(h) = \{u \in \mathcal{U} : (h, u) \in \mathcal{E}\}$. Each user $u \in \mathcal{U}$ requests a video 
file $f_u$, which is formed by a sequence of chunks. Each chunk corresponds to a group of pictures (GOP) 
that are encoded and decoded as stand-alone units 0. Chunks have a fixed playback duration, given by 
$T_{\mathrm{gop}} = (\text{\# frames per GOP})/\eta$, where $\eta$ is the frame rate, expressed in frames per second. The streaming 
process consists of transferring chunks from the helpers to the requesting users such that the playback 
buffer at each user contains the required chunks at the beginning of each chunk playback deadline. The 
playback starts after a short pre-buffering time, during which the playback buffer is filled by a determined 
amount of ordered chunks. The details related to pre-buffering and chunk playback deadlines are discussed 
in Section VI. 

Each file $f$ is encoded at a finite number of different quality/compression levels $m \in \{1, \ldots, N_f\}$. 
Due to the variable bit rate (VBR) nature of video coding [22], the quality-rate profile of a given file 
$f$ may vary from chunk to chunk. In particular, we let $D_f(m, i)$ and $B_f(m, i)$ denote the video quality 
measure (e.g., see [23]) and the size (in number of bits) of the $i$-th chunk in file $f$ at quality level $m$, 
respectively. 

A. Time-scales 

It is important to note that the time scale at which chunks are requested and the time scale at which 
PHY layer transmissions are scheduled differ by 1-3 orders of magnitude. For instance, in current video 
streaming technology [3], the typical video chunk spans a duration of 0.5-2 seconds, while the duration of 
a PHY frame is of the order of milliseconds. In the following, we consider dynamic scheduling policies 
that operate at the PHY frame time scale, i.e., they provide a scheduling/resource allocation decision at 
each PHY frame time $t \in \mathbb{Z}$. However, new chunks are requested at multiples of the chunk time, i.e., at 
times $t = in$ for $i \in \mathbb{Z}$, with $n$ denoting the number of PHY frames per chunk time, assumed here to be 
an integer for simplicity. In the rest of the paper, we consistently use the following notation: the index 
$t$ denotes PHY frame transmission slots, and the index $i$ denotes video chunks. 

³For example, with a PHY frame duration of 10 ms (as in the LTE 4G standard [24]) and assuming $T_{\mathrm{gop}} = 0.5$ s, a video 
chunk spans $n = 0.5\,\mathrm{s} / 10\,\mathrm{ms} = 50$ PHY frames. 

B. Request Queue Dynamics 

At the beginning of the $i$-th chunk time, each user $u \in \mathcal{U}$ requests a particular quality mode for the 
$i$-th chunk of its video stream. That is, on each slot $t \in \{0, n, 2n, 3n, \ldots\}$, each user $u \in \mathcal{U}$ specifies 
the quality mode $m_u(t) \in \{1, 2, \ldots, N_{f_u}\}$ for its next video chunk. This decision specifies the quality 
$D_{f_u}(m_u(t), t)$ and the amount of bits $B_{f_u}(m_u(t), t)$ associated with that chunk. As these decisions are 
made only at times $t$ that are multiples of $n$, it is convenient to define: 

$$D_{f_u}(m_u(t), t) = 0 \quad \text{and} \quad B_{f_u}(m_u(t), t) = 0 \quad \text{for } t \notin \{0, n, 2n, \ldots\}. \tag{1}$$

The bits $B_{f_u}(m_u(t), t)$ are called the requested bits of user $u$ at slot $t$, and are placed in a request queue 
$Q_u(t)$. The request queue evolves over the transmission slots $t \in \{0, 1, 2, \ldots\}$ as: 

$$Q_u(t+1) = \max\{Q_u(t) - \mu_u(t) + B_{f_u}(m_u(t), t),\ 0\} \quad \forall\, u \in \mathcal{U}, \tag{2}$$

where $\mu_u(t)$ is the amount of bits downloaded by user $u$ on slot $t$. Note that the request queue in (2) can 
decrease every transmission slot $t$ as new bits are downloaded, but can only increase on slots $t = in$, i.e., 
on integer multiples of $n$. Intuitively, $Q_u(t)$ consists of bits associated with all chunks that have been 
requested by user $u$ but not yet fully received. 
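
The recursion (2), together with the convention (1), can be traced numerically for a single user. The following Python sketch is only an illustration under stated assumptions: the function names `step_request_queue` and `simulate`, the fixed chunk size, and the constant per-slot download rate are hypothetical choices made for the example and are not part of the paper.

```python
def step_request_queue(q, mu, requested_bits):
    """One slot of recursion (2): Q(t+1) = max{Q(t) - mu(t) + B(t), 0}."""
    return max(q - mu + requested_bits, 0.0)


def simulate(num_slots, n, chunk_bits, mu):
    """Track Q_u(t) for one user that requests `chunk_bits` every n slots."""
    q = 0.0
    trace = []
    for t in range(num_slots):
        # Convention (1): requested bits are zero unless t is a multiple of n.
        b = chunk_bits if t % n == 0 else 0.0
        q = step_request_queue(q, mu, b)
        trace.append(q)
    return trace


# Hypothetical numbers: n = 5 slots per chunk, 100-bit chunks, 30 bits served per slot.
trace = simulate(num_slots=10, n=5, chunk_bits=100.0, mu=30.0)
print(trace)
```

In this toy run the queue jumps only at slots 0 and 5 (multiples of n) and drains in between, which is the behaviour described in the text.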

The quantity $\mu_u(t)$ indicates the instantaneous aggregate downloading rate of user $u$ on slot $t$, expressed 
in bits per slot. This is given by 

$$\mu_u(t) = \sum_{h \in \mathcal{N}(u)} 1_{hu}\, \mu_{hu}(t), \tag{3}$$

where $1_{hu}$ is an indicator function, equal to 1 if helper $h$ has the video file requested by user $u$ and 
zero otherwise, and $\mu_{hu}(t)$ is the rate served by helper $h$ to user $u$ on slot $t$. The matrix $[\mu_{hu}(t)]$ of 
transmission rates is selected within a set of feasible transmission rate matrices for slot $t$. The set of all 
rate matrices supported by the network at a given slot time $t$ is referred to as the feasible instantaneous 
rate region at time $t$, and depends on the network topology and channel state (e.g., on the fading channel 
realization). Specifically, let $\omega(t)$ represent the topology state on slot $t$, being a vector of parameters that 
affect transmission, such as current device locations and/or channel conditions. Assume $\omega(t)$ takes values 
in an abstract set $\Omega$, possibly an infinite set. For each $\omega \in \Omega$, define $\mathcal{R}(\omega)$ to be the feasible rate 
region of the network for state $\omega$. Then, the feasible instantaneous rate region is $\mathcal{R}(\omega(t))$. For example, the 
set $\mathcal{R}(\omega)$ may include the constraint that each user can receive a positive rate from at most one helper 
and/or constrain helpers to restrict transmissions to at most $S$ users, where $S$ denotes the maximum 
number of downlink data streams (spatial multiplexing gain) that the helper station can handle. The 
set $\mathcal{R}(\omega)$ can also handle models that allow simultaneous download from multiple helpers (for instance, 
in a cellular CDMA system with macro diversity), or information-theoretic capacity regions of various 
network topology models, inclusive of broadcast and interference constraints (e.g., [26]). We also mention 
that this framework can handle non-wireless scenarios. For example, it can constrain $[\mu_{hu}(t)]$ 
to be permutation matrices associated with packet switch constraints. However, as explained in Section 
I, it is desirable for current and future systems to take advantage of massive MU-MIMO capabilities at 
the helpers. Section V specifies $\mathcal{R}(\omega)$ for the relevant wireless scenario with helpers employing massive 
MU-MIMO, which is the primary focus of this paper. The simulation results in Section VII are carried 
out under this specific wireless model. 

Remark 1: Each user $u$ maintains $Q_u(t)$ and updates it according to (2) every transmission slot 
$t$. A small amount of bookkeeping is also required by the user to associate the bits in $Q_u(t)$ with their 
appropriate chunks. Specifically, each user maintains a list of chunks it has requested but not yet fully 
received, along with the quality modes it requested for each chunk. It can receive new bits on slot $t$ only 
from a helper that has its requested file, and only if $Q_u(t) > 0$. When downloading these bits, the user 
first informs the helper of the requested chunks, the desired quality levels, and the bit location needed 
for downloading the residual bits of the next-in-line chunk. 

⁴See [25] for a discussion of various wireless multiple access scenarios and interference models that fit this general framework. 

III. Problem Formulation and Streaming Policy 

When optimizing the users’ video QoE we have to take into account that users compete for the 
same shared transmission resource (the network wireless spectrum and the helpers’ spatial downlink data 
streams) and, given the fact that the users are placed in arbitrary positions with respect to the helpers, 
their attainable service rates may be quite different. Hence, some fairness criterion must be enforced. 
In addition, we need to carefully define the notion of QoE, since the adaptive nature of the streaming 
process involves a possibly time-varying quality level across the streaming sessions. 

As already mentioned, in order to obtain a tractable formulation, 
we adopt the divide and conquer approach pioneered in [13]: 

1) We first formulate the NUM problem (4), where the network utility function is a concave and 
component-wise non-decreasing function of the time-averaged users’ requested video quality and 
the maximization is subject to the stability of all the request queues in the system. 

2) We then solve the NUM problem using the Lyapunov Optimization framework and obtain the 
drift-plus-penalty policy, which adapts to arbitrarily changing network conditions and in fact is 
optimal (with respect to the NUM problem) under non-stationary and non-ergodic evolution of the 
underlying network state process. 

3) Since all the request queues in the system are ensured to be stable, the requested video chunks 
are eventually delivered. However, in order to ensure that all the video chunks are delivered within 
their playback deadline, it suffices for every user to choose a pre-buffering time which exceeds 
the largest delay with which a chunk is delivered. In particular, when the maximum delay of each 
request queue in the system admits a deterministic upper bound, setting the pre-buffering time larger 
than such a bound makes the playback buffer under-run rate zero. However, for a system with arbitrary 
(non-stationary, non-ergodic) evolution of the underlying network state process (e.g., arbitrary 
user mobility and arbitrary per-chunk fluctuations of video coding rate due to VBR coding), such 
deterministic upper bounds on the maximum delay may not exist or may be too loose to be useful in 
practice. Hence, in Section VI we propose a method to locally estimate the delays with which 
video chunks are delivered, such that each user can set its pre-buffering and re-buffering times 
to be larger than the locally estimated maximum delay. Through simulations in Section VII, we 
demonstrate the effectiveness of the combination of the drift-plus-penalty policy and the adaptive 
pre-buffering scheme. 

In the rest of this section, we focus on the NUM problem formulation and its solution through the 
drift-plus-penalty approach. Throughout this work, we use the following notation for the time average quantities 
of interest: we let $\overline{D}_u := \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[D_{f_u}(m_u(\tau), \tau)\right]$ denote the time average of the expected 
quality of user $u$, and $\overline{Q}_u := \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[Q_u(\tau)\right]$ be the time average of the expected length 
of the request queue at user $u$, assuming that these limits exist. More generally, we use the overline 
notation to indicate limiting time averages. Let $\phi_u(\cdot)$ be a concave, continuous, and non-decreasing 
function defining network utility vs. video quality for user $u \in \mathcal{U}$. The NUM problem that we wish to 
solve is given by: 

$$\begin{aligned}
\text{maximize} \quad & \sum_{u \in \mathcal{U}} \phi_u(\overline{D}_u) && \text{(4a)} \\
\text{subject to} \quad & \overline{Q}_u < \infty \quad \forall\, u \in \mathcal{U} && \text{(4b)} \\
& [\mu_{hu}(t)] \in \mathcal{R}(\omega(t)) \quad \forall\, t && \text{(4c)} \\
& m_u(t) \in \{1, 2, \ldots, N_{f_u}\} \quad \forall\, u \in \mathcal{U},\ \forall\, t, && \text{(4d)}
\end{aligned}$$

where the requirement of finite $\overline{Q}_u$ corresponds to the strong stability condition for all the queues [27]. 

By appropriately choosing the functions $\phi_u(\cdot)$, we can impose some desired notion of fairness. For 
example, a general class of concave functions suitable for this purpose is given by the $\alpha$-fairness network 
utility, defined by [28] 

$$\phi_u(x) = \begin{cases} \log x, & \alpha = 1 \\ \dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \ne 1. \end{cases} \tag{5}$$

In this case, it is well-known that $\alpha = 0$ yields the maximization of the sum quality (no fairness), $\alpha \to \infty$ 
yields the maximization of the worst-case quality (max-min fairness), and $\alpha = 1$ yields the maximization 
of the geometric mean quality (proportional fairness). 
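
To make the role of the fairness parameter in (5) concrete, the sketch below evaluates the utility for two made-up quality allocations with the same total. It is a hedged illustration: the function names and the numerical values are assumptions introduced here, and the qualities are taken to be strictly positive so that the logarithm is defined.

```python
import math


def alpha_fair_utility(x, alpha):
    """Utility (5): log(x) for alpha = 1, x^(1 - alpha) / (1 - alpha) otherwise."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 1:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)


def network_utility(qualities, alpha):
    """Sum of per-user utilities, as in the objective (4a)."""
    return sum(alpha_fair_utility(d, alpha) for d in qualities)


# Two hypothetical allocations of time-averaged quality with equal sum.
balanced = [30.0, 30.0]
skewed = [50.0, 10.0]
for alpha in (0.0, 1.0, 2.0):
    print(alpha, network_utility(balanced, alpha), network_utility(skewed, alpha))
```

With alpha = 0 the two allocations score the same (only the sum matters), while for alpha = 1 and alpha = 2 the balanced allocation scores higher, in line with the fairness interpretation given above.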

⁵The existence of these limits is assumed temporarily for ease of exposition of the optimization problem (4), but is not required 
for the derivation of the scheduling policy and for the proof of Theorem 1. 

In order to solve problem (4) using the stochastic optimization theory developed in [27], it is convenient 
to transform it into an equivalent problem that involves the maximization of a single time average. This 
transformation is achieved through the use of auxiliary variables $\gamma_u(t)$ and the corresponding virtual 
queues $\Theta_u(t)$ with buffer evolution: 

$$\Theta_u(t+1) = \max\{\Theta_u(t) + \gamma_u(t) - D_{f_u}(m_u(t), t),\ 0\}. \tag{6}$$

Consider the transformed problem: 

$$\begin{aligned}
\text{maximize} \quad & \sum_{u \in \mathcal{U}} \overline{\phi_u(\gamma_u)} && \text{(7a)} \\
\text{subject to} \quad & \overline{Q}_u < \infty \quad \forall\, u \in \mathcal{U} && \text{(7b)} \\
& \overline{\gamma}_u \le \overline{D}_u \quad \forall\, u \in \mathcal{U} && \text{(7c)} \\
& D_u^{\min} \le \gamma_u(t) \le D_u^{\max} \quad \forall\, u \in \mathcal{U} && \text{(7d)} \\
& [\mu_{hu}(t)] \in \mathcal{R}(\omega(t)) \quad \forall\, t && \text{(7e)} \\
& m_u(t) \in \{1, 2, \ldots, N_{f_u}\} \quad \forall\, u \in \mathcal{U},\ \forall\, t, && \text{(7f)}
\end{aligned}$$

where $D_u^{\min}$ and $D_u^{\max}$ are uniform lower and upper bounds on the quality function $D_{f_u}(\cdot, t)$. Notice that 
constraints (7c) correspond to stability of the virtual queues $\Theta_u$, since $\overline{\gamma}_u$ and $\overline{D}_u$ are the time-averaged 
arrival rate and the time-averaged service rate for the virtual queue given in (6). We have: 

Lemma 1: Problems (4) and (7) are equivalent. 

Proof: The proof is well-known (see [13], [27] for instance) and is omitted due to space constraints. 


A. The Drift-Plus-Penalty Expression 

Let $\mathbf{Q}(t)$ denote the column vector containing the backlogs of queues $Q_u(t)\ \forall\, u \in \mathcal{U}$, let $\boldsymbol{\Theta}(t)$ denote 
the column vector for the virtual queues $\Theta_u(t)\ \forall\, u \in \mathcal{U}$, $\boldsymbol{\gamma}(t)$ denote the column vector with elements 
$\gamma_u(t)\ \forall\, u \in \mathcal{U}$, $\mathbf{B}(t)$ denote the column vector with elements $B_{f_u}(m_u(t), t)\ \forall\, u \in \mathcal{U}$, $\mathbf{D}(t)$ denote the 
column vector with elements $D_{f_u}(m_u(t), t)\ \forall\, u \in \mathcal{U}$, and $\boldsymbol{\mu}(t)$ denote the column vector with elements 
$\mu_u(t)\ \forall\, u \in \mathcal{U}$ as defined in (3). Let $\mathbf{G}(t) = [\mathbf{Q}^T(t), \boldsymbol{\Theta}^T(t)]^T$ be the composite vector of queue backlogs 
and define the quadratic Lyapunov function $L(\mathbf{G}(t)) = \frac{1}{2}\mathbf{G}^T(t)\mathbf{G}(t)$. Intuitively, taking actions to push 
$L(\mathbf{G}(t))$ down tends to maintain stability of all queues. Define $\Delta(\mathbf{G}(t))$ as the one-slot drift of the 
Lyapunov function at slot $t$: 

$$\Delta(\mathbf{G}(t)) \triangleq L(\mathbf{G}(t+1)) - L(\mathbf{G}(t)). \tag{8}$$

The drift-plus-penalty algorithm is designed to observe the queues, the current $(B_{f_u}(\cdot, t), D_{f_u}(\cdot, t))$ for 
all users $u$ and $\omega(t)$ on each slot $t$, and to then choose the quality mode $m_u(t)$ for all users $u$, the matrix of 
transmitted bits $[\mu_{hu}(t)] \in \mathcal{R}(\omega(t))$, and $\gamma_u(t)$ subject to $D_u^{\min} \le \gamma_u(t) \le D_u^{\max}$ to minimize a bound 
on the following drift-plus-penalty expression: 

$$\Delta(\mathbf{G}(t)) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(t)), \tag{9}$$

where $V$ is a non-negative weight that affects a performance bound. Intuitively, the value of $V$ affects 
the extent to which the control actions on slot $t$ emphasize utility maximization in comparison to drift 
minimization. 

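Lemma 2: Under any control algorithm, the drift-plus-penalty expression satisfies: 

$$\Delta(\mathbf{G}(t)) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(t)) \le \mathcal{K} - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(t)) + (\mathbf{B}(t) - \boldsymbol{\mu}(t))^T \mathbf{Q}(t) + (\boldsymbol{\gamma}(t) - \mathbf{D}(t))^T \boldsymbol{\Theta}(t), \tag{10}$$

where $\mathcal{K}$ is a uniform upper bound on the term 

$$\frac{1}{2}\left[ (\mathbf{B}(t) - \boldsymbol{\mu}(t))^T (\mathbf{B}(t) - \boldsymbol{\mu}(t)) + (\boldsymbol{\gamma}(t) - \mathbf{D}(t))^T (\boldsymbol{\gamma}(t) - \mathbf{D}(t)) \right].$$

Proof: Expanding the quadratic Lyapunov function, we have 

$$\begin{aligned}
L(\mathbf{G}(t+1)) - L(\mathbf{G}(t)) &= \frac{1}{2}\left[ \mathbf{Q}^T(t+1)\mathbf{Q}(t+1) - \mathbf{Q}^T(t)\mathbf{Q}(t) \right] + \frac{1}{2}\left[ \boldsymbol{\Theta}^T(t+1)\boldsymbol{\Theta}(t+1) - \boldsymbol{\Theta}^T(t)\boldsymbol{\Theta}(t) \right] \\
&= \frac{1}{2}\left[ (\max\{\mathbf{Q}(t) - \boldsymbol{\mu}(t) + \mathbf{B}(t), 0\})^T (\max\{\mathbf{Q}(t) - \boldsymbol{\mu}(t) + \mathbf{B}(t), 0\}) - \mathbf{Q}^T(t)\mathbf{Q}(t) \right] \\
&\quad + \frac{1}{2}\left[ (\max\{\boldsymbol{\Theta}(t) + \boldsymbol{\gamma}(t) - \mathbf{D}(t), 0\})^T (\max\{\boldsymbol{\Theta}(t) + \boldsymbol{\gamma}(t) - \mathbf{D}(t), 0\}) - \boldsymbol{\Theta}^T(t)\boldsymbol{\Theta}(t) \right],
\end{aligned} \tag{11}$$

where we have used the queue evolution equations (2) and (6) and “max” is applied componentwise. 
Using the fact that for any non-negative scalar quantities $\Theta$, $\gamma$ and $D$ we have the inequalities 

$$(\max\{\Theta + \gamma - D, 0\})^2 \le (\Theta + \gamma - D)^2 = \Theta^2 + (\gamma - D)^2 + 2\Theta(\gamma - D), \tag{12}$$

we have 

$$\begin{aligned}
L(\mathbf{G}(t+1)) - L(\mathbf{G}(t)) \le{}& \frac{1}{2}(\mathbf{B}(t) - \boldsymbol{\mu}(t))^T (\mathbf{B}(t) - \boldsymbol{\mu}(t)) + (\mathbf{B}(t) - \boldsymbol{\mu}(t))^T \mathbf{Q}(t) \\
&+ \frac{1}{2}(\boldsymbol{\gamma}(t) - \mathbf{D}(t))^T (\boldsymbol{\gamma}(t) - \mathbf{D}(t)) + (\boldsymbol{\gamma}(t) - \mathbf{D}(t))^T \boldsymbol{\Theta}(t).
\end{aligned} \tag{13}$$

Under the realistic assumption that the chunk sizes, the transmission rates and the video quality 
measures are bounded above by some constants, independent of $t$, the term 

$$\frac{1}{2}\left[ (\mathbf{B}(t) - \boldsymbol{\mu}(t))^T (\mathbf{B}(t) - \boldsymbol{\mu}(t)) + (\boldsymbol{\gamma}(t) - \mathbf{D}(t))^T (\boldsymbol{\gamma}(t) - \mathbf{D}(t)) \right]$$

is bounded above by a constant $\mathcal{K}$. Using this fact and adding the penalty term $-V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(t))$ on 
both sides of the inequality (13) yields the result. ■ 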
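
As a sanity check on the drift bound (13), the inequality can be tested numerically on random non-negative inputs. The sketch below is an illustration only: the function names, the four-user setup, and the uniform sampling ranges are assumptions of this example, and a small tolerance absorbs floating-point rounding in the cases where (13) holds with equality.

```python
import random


def lyapunov(q, theta):
    """Quadratic Lyapunov function L(G) = 0.5 * (sum Q_u^2 + sum Theta_u^2)."""
    return 0.5 * (sum(x * x for x in q) + sum(x * x for x in theta))


def drift_and_bound(q, theta, b, mu, gamma, d):
    """Return the one-slot drift (8) and the right-hand side of (13)."""
    q_next = [max(qu - m + bu, 0.0) for qu, m, bu in zip(q, mu, b)]          # update (2)
    th_next = [max(th + g - du, 0.0) for th, g, du in zip(theta, gamma, d)]  # update (6)
    drift = lyapunov(q_next, th_next) - lyapunov(q, theta)
    bound = sum(0.5 * (bu - m) ** 2 + (bu - m) * qu for qu, m, bu in zip(q, mu, b))
    bound += sum(0.5 * (g - du) ** 2 + (g - du) * th for th, g, du in zip(theta, gamma, d))
    return drift, bound


rng = random.Random(0)
violations = 0
for _ in range(1000):
    # Six random non-negative vectors, in the order q, theta, b, mu, gamma, d.
    vals = [[rng.uniform(0.0, 10.0) for _ in range(4)] for _ in range(6)]
    drift, bound = drift_and_bound(*vals)
    if drift > bound + 1e-9:
        violations += 1
print("violations:", violations)
```

No violations are expected, since clipping at zero can only reduce the squared queue length relative to the unclipped expression used in (12).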

The drift-plus-penalty (DPP) policy described below acquires information about the queue states $\mathbf{G}(t)$, 
the rate-quality profile $(B_{f_u}(\cdot, t), D_{f_u}(\cdot, t))$ for all users $u$ and the channel state $\omega(t)$ at every slot $t$, and 
chooses control actions $m_u(t)$, $[\mu_{hu}(t)] \in \mathcal{R}(\omega(t))$ and $\gamma_u(t)$ subject to $D_u^{\min} \le \gamma_u(t) \le D_u^{\max}$ in order 
to minimize the last three terms on the right hand side of the inequality (10). 

The non-constant part in the right hand side of (10) can be re-written as: 

$$\left[ \mathbf{B}^T(t)\mathbf{Q}(t) - \mathbf{D}^T(t)\boldsymbol{\Theta}(t) \right] - \left[ V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(t)) - \boldsymbol{\gamma}^T(t)\boldsymbol{\Theta}(t) \right] - \boldsymbol{\mu}^T(t)\mathbf{Q}(t). \tag{14}$$

The resulting control actions are given by the minimization, at transmission slot $t$, of the expression in 
(14). Notice that the first term of (14) depends only on $m_u(t)\ \forall\, u \in \mathcal{U}$, the second term of (14) depends 
only on $\boldsymbol{\gamma}(t)$, and the third term of (14) depends only on $\boldsymbol{\mu}(t)$. Thus, the overall minimization decomposes 
into three separate sub-problems, yielding the layered scheme given below. 


B. The Drift-Plus-Penalty Policy 

We address the minimization of ([T4l) focusing separately on its (separable) components. 

1) Control actions at the user nodes (pull congestion control): The first term in (1141) is given by 

'^{Qu{t)Bf^{mu{t),t) -&uit)Df^ {mu{t),t)}. (15) 

u€U 

The minimization variables mu{t) appear in separate terms of the sum and hence can be optimized 
separately over each user u ^U. Thus, each user observes the queues Quit), Quit) and is aware of the 
the rate-quality profile (i?j^(•,f), (•, t)) on slot t (vidoe meta-data), so that it can choose the quality 

level of the requested chunk at every video chunk slot i, i.e., at transmission slots t G {in : i G Z} as: 

muit) = argmm{Quit)Bf^{m,t) - Quit)Df^{m,t) : m G {1,..., W/„}} . (16) 

As defined in O, for all transmission slots t which are not integer multiples of n, there is no chunk 
requested and therefore Bf^imuit),t) and Df^imuit),t) are equal to be 0. The second term in (IT4l) . 
after a change of sign, is given by 

X i'^M'lnit)) - 7n(f)0«(f)} • 

u&A 


(17) 
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Again, this is maximized by maximizing separately each term, yielding the simple one-dimensional 
maximization (e.g., solvable by line-search): 

7.(i) = argmax - Quith ■ 7 e 

We refer to the policy (16) as pull congestion control since each user $u$ selects the quality level at which each chunk is requested by taking into account the state of its request queue $Q_u$. It chooses a video quality level that balances the desire for high quality (reflected by the term $-\Theta_u(t) D_{f_u}(m, t)$ in (16)) and the desire for low request queue lengths (reflected by the term $Q_u(t) B_{f_u}(m, t)$ in (16)), and then opportunistically pulls the chunk at that video quality level from the helpers in its current vicinity. This policy is reminiscent of the current DASH technology, where the client (user) progressively fetches a video file by downloading successive chunks, and makes adaptive decisions on the source encoding quality based on its current knowledge of the congestion of the underlying server-client connection. Notice also that, in order to compute (16) and (18), each user needs to know only local information formed by the locally maintained request queue backlog $Q_u(t)$ and by the locally computed virtual queue backlog $\Theta_u(t)$.
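The two user-side decisions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the list-based rate-quality profile and the 0-based quality index are assumptions made for illustration, and the closed form for the auxiliary variable is only valid for the logarithmic utility $\phi_u(\cdot) = \log(\cdot)$ (for other concave utilities a line search would be needed).

```python
def choose_quality(Q, theta, sizes, qualities):
    """Quality selection in the spirit of Eq. (16).

    sizes[m] and qualities[m] play the role of B(m, t) and D(m, t) for the
    chunk requested in the current video chunk slot (hypothetical inputs).
    Returns the 0-based index m minimizing Q * B(m) - theta * D(m).
    """
    costs = [Q * b - theta * d for b, d in zip(sizes, qualities)]
    return min(range(len(costs)), key=costs.__getitem__)


def choose_gamma_log(V, theta, d_min, d_max):
    """Auxiliary variable of Eq. (18), assuming phi(x) = log(x).

    V * log(g) - theta * g is concave in g with stationary point g = V / theta,
    so the maximizer over [d_min, d_max] is that point clipped to the interval.
    """
    if theta <= 0:
        return d_max  # objective is increasing in g when theta is zero
    return min(max(V / theta, d_min), d_max)


if __name__ == "__main__":
    sizes = [100, 200, 400]        # chunk sizes in bits per quality mode (made up)
    qualities = [0.5, 0.8, 0.9]    # quality per mode (made up)
    print(choose_quality(1.0, 1000.0, sizes, qualities))
    print(choose_gamma_log(10.0, 12.5, 0.6, 1.0))
```

A long request queue $Q_u(t)$ pushes the choice toward small chunks, while a large virtual queue $\Theta_u(t)$ pushes it toward high quality, which is the balance described above.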

2) Control actions at the helper nodes (transmission scheduling): At transmission slot $t$, the network controller observes the queues $Q_u(t)$ of all users $u$ and the topology state $\omega(t)$, and chooses the feasible instantaneous rate matrix $[\mu_{hu}(t)] \in \mathcal{R}(\omega(t))$ to maximize the weighted sum of the transmission rates achievable in transmission slot $t$. Namely, the network of helpers must solve the Max-Weighted Sum Rate (MWSR) problem:

$$\begin{aligned} \text{maximize} \quad & \sum_{h \in \mathcal{H}} \sum_{u \in \mathcal{N}(h)} Q_u(t) \mu_{hu}(t) \\ \text{subject to} \quad & [\mu_{hu}(t)] \in \mathcal{R}(\omega(t)) \end{aligned} \tag{19}$$

where $\mathcal{R}(\omega(t))$ is the feasible instantaneous rate region of the network at slot $t$. It is immediate to see that, after a change of sign, the maximization of the third term in (14) yields the problem (19).

IV. Policy Performance 

As outlined in Section II, VBR video yields time-varying quality and rate functions $D_f(m, t)$ and $B_f(m, t)$, which depend on the individual video file. Furthermore, arbitrary user motion yields slower time variations of the pathloss coefficients at the same time-scale of the video streaming session. As a result, any stationarity or ergodicity assumption about the topology state $\omega(t)$, the rate function $B_f(m, t)$ and the quality function $D_f(m, t)$ is unlikely to hold in most practically relevant settings. Therefore, we consider the optimality of the DPP policy for an arbitrary sample path of the topology state $\omega(t)$, the quality function $D_f(m, t)$ and the rate function $B_f(m, t)$. Following in the footsteps of [27], [29], we compare the network utility achieved by our DPP policy with that achieved by an optimal oracle policy with $T$-slot lookahead, i.e., knowledge of the future sample path over an interval of length $T$ slots. Time is split into frames of duration $T$ slots and we consider $F$ such frames. For an arbitrary sample path of $D_f(m, t)$ and $B_f(m, t)$, we consider the static optimization problem over the $j$-th frame


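$$\begin{aligned}
\text{maximize} \quad & \sum_{u \in \mathcal{U}} \phi_u\left( \frac{1}{T} \sum_{\tau = jT}^{(j+1)T-1} D_{f_u}(m_u(\tau), \tau) \right) && \text{(20)} \\
\text{subject to} \quad & \frac{1}{T} \sum_{\tau = jT}^{(j+1)T-1} \left[ B_{f_u}(m_u(\tau), \tau) - \mu_u(\tau) \right] \le 0 \quad \forall\, u \in \mathcal{U} && \text{(21)} \\
& [\mu_{hu}(\tau)] \in \mathcal{R}(\omega(\tau)) \quad \forall\, \tau \in \{jT, \ldots, (j+1)T-1\} && \text{(22)} \\
& m_u(\tau) \in \{1, 2, \ldots, N_{f_u}\} \quad \forall\, u \in \mathcal{U},\ \forall\, \tau \in \{jT, \ldots, (j+1)T-1\} && \text{(23)}
\end{aligned}$$

and denote by $\phi_j^{\mathrm{opt}}$ the maximum of the network utility function for frame $j$, achieved over all policies which have future knowledge of the sample path over the $j$-th frame, subject to the constraints (21)-(23). We have the following result: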

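Theorem 1: The DPP scheduling policy achieves per-sample path network utility

$$\sum_{u \in \mathcal{U}} \phi_u\left(\overline{D}_u\right) \ge \lim_{F \to \infty} \frac{1}{F} \sum_{j=0}^{F-1} \phi_j^{\mathrm{opt}} - O\left(\frac{1}{V}\right) \tag{24}$$

with bounded queue backlogs satisfying

$$\lim_{F \to \infty} \frac{1}{FT} \sum_{\tau=0}^{FT-1} \left( \sum_{u \in \mathcal{U}} Q_u(\tau) + \sum_{u \in \mathcal{U}} \Theta_u(\tau) \right) \le O(V) \tag{25}$$

where $O(1/V)$ indicates a term that vanishes as $1/V$ and $O(V)$ indicates a term that grows linearly with $V$, as the policy control parameter $V$ grows large.

Proof: See Appendix A. ■

An immediate corollary of Theorem 1 is:

Corollary 1: For the system defined in Section II, when the evolution of the topology state $\omega(t)$, the rate function $B_f(m, t)$ and the quality function $D_f(m, t)$ is stationary and ergodic, then

$$\sum_{u \in \mathcal{U}} \phi_u\left(\overline{D}_u\right) \ge \phi^{\mathrm{opt}} - O\left(\frac{1}{V}\right) \tag{26}$$

where $\phi^{\mathrm{opt}}$ is the optimal value of the NUM problem in the stationary ergodic case, and

$$\sum_{u \in \mathcal{U}} \overline{Q}_u + \sum_{u \in \mathcal{U}} \overline{\Theta}_u \le O(V). \tag{27}$$

In particular, if the network state is i.i.d., the bounding term in (26) is explicitly given by $O(1/V) = \mathcal{K}/V$, and the bounding term in (27) is explicitly given by $O(V) = \frac{\mathcal{K} + V(\phi_{\max} - \phi_{\min})}{\epsilon}$, where $\phi_{\max} = \sum_{u \in \mathcal{U}} \phi_u(D_u^{\max})$, $\phi_{\min} = \sum_{u \in \mathcal{U}} \phi_u(D_u^{\min})$, $\epsilon > 0$ is the slack variable corresponding to the constraint (21), and the constant $\mathcal{K}$ is defined in (10).

Proof: See Appendix A. ■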

V. Wireless System Model with Massive MU-MIMO Helpers 

In this section, we first specify the region of instantaneous service rates $\mathcal{R}(\omega(t))$ for the specific PHY layer model comprising massive MU-MIMO at each helper. We then specialize the weighted sum-rate maximization problem (19) to this system. By exploiting the channel-hardening effect of high dimensional MIMO channels, we observe that the MWSR problem is optimally solved by a low complexity greedy algorithm which can be implemented in a distributed manner, with each helper independently choosing user subsets for MU-MIMO beamforming.

A. Helpers with Massive MU-MIMO 

Each helper $h$, with a large number of antennas $M$ installed, implements MU-MIMO to serve the users $\mathcal{N}(h)$ in its vicinity. As a result, helper $h$ can serve simultaneously, in the spatial domain, any subset of size not larger than $\min\{M, |\mathcal{N}(h)|\}$ of the users in $\mathcal{N}(h)$. We further assume that each helper performs linear zero-forcing beamforming (LZFBF) to the set of selected users (referred to in the following as "active users").

The wireless channel is modeled by the well-known and widely accepted block-fading model, where at each transmission slot $t$, the channel corresponding to the helper-user link $(h, u)$ in $\mathcal{E}$ is given by

$$y_u(t) = \sqrt{g_{hu}(t)}\, \mathbf{h}_{hu}^{\mathsf{H}}(t) \mathbf{V}_h(t) \mathbf{x}_h(t) + \sum_{h' \neq h} \sqrt{g_{h'u}(t)}\, \mathbf{h}_{h'u}^{\mathsf{H}}(t) \mathbf{V}_{h'}(t) \mathbf{x}_{h'}(t) + z_u(t) \tag{28}$$

where $\mathbf{h}_{hu}(t)$ is the $M \times 1$ column vector of channel coefficients from the antenna array of helper $h$ to the receiving antenna of user $u$, $g_{hu}(t)$ is the large-scale distance dependent pathloss from helper $h$ to user $u$, $\mathbf{V}_h(t)$ is the downlink precoding matrix of helper $h$, and $\mathbf{x}_h(t)$ is the vector of transmitted complex information symbols (QAM modulation) of helper $h$. $z_u(t)$ denotes the additive Gaussian noise at the $u$-th user receiver. Notice that this model takes fully into account the inter-cell interference of the signals sent by other helpers $h' \neq h$ on the link from helper $h$ to user $u$.

Footnote: Notice that in the stationary and ergodic case the value $\phi^{\mathrm{opt}}$ is generally achieved by an instantaneous policy with perfect knowledge of the state statistics or, equivalently, by a policy with infinite look-ahead, since the state statistics can be learned arbitrarily well from any sample path with probability 1, because of ergodicity.

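We use $\mathcal{S}_h(t)$ to denote the subset that is chosen for LZFBF in transmission slot $t$. The $M \times 1$ channel vectors $\mathbf{h}_{hu}(t)$ of all users $u \in \mathcal{S}_h(t)$ are assumed to be known at the helper $h$ through some form of channel state feedback. Such channel vectors are collected as the columns of an $M \times |\mathcal{S}_h(t)|$ channel matrix $\mathbf{H}(t)$. The LZFBF precoded signal vector is given by $\mathbf{V}_h(t)\mathbf{x}_h(t)$, where $\mathbf{x}_h(t)$ is the $|\mathcal{S}_h(t)| \times 1$ column vector of symbols to be sent to users $u \in \mathcal{S}_h(t)$ and $\mathbf{V}_h(t)$ is the ZFBF precoding matrix of dimension $M \times |\mathcal{S}_h(t)|$ given by the normalized pseudo-inverse

$$\mathbf{V}_h(t) = \mathbf{H}(t) \left( \mathbf{H}^{\mathsf{H}}(t) \mathbf{H}(t) \right)^{-1} \boldsymbol{\Lambda}^{1/2}(t) \tag{29}$$

where $\boldsymbol{\Lambda}(t)$ is a column-normalizing diagonal matrix with the $u$-th diagonal element given by

$$\Lambda_u(t) = \left( \left[ \left( \mathbf{H}^{\mathsf{H}}(t) \mathbf{H}(t) \right)^{-1} \right]_{uu} \right)^{-1} \tag{30}$$

where $[\cdot]_{uu}$ denotes the $u$-th diagonal element of the matrix argument. Using the fact that $\mathbf{H}^{\mathsf{H}}(t)\mathbf{V}_h(t) = \boldsymbol{\Lambda}^{1/2}(t)$, the resulting downlink channel to user $u \in \mathcal{S}_h(t)$ becomes

$$y_u(t) = \sqrt{g_{hu}(t) \Lambda_u(t)}\, x_{hu}(t) + z_u(t) \tag{31}$$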


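where $g_{hu}(t)$ is the large scale pathloss coefficient from helper $h$ to user $u$. Under the assumption that $M, |\mathcal{S}_h(t)| \to \infty$ with a fixed ratio $|\mathcal{S}_h(t)|/M < 1$, random matrix theory results (see, e.g., [30]) can be invoked to show that

$$\Lambda_u(t) \to M \left( 1 - \frac{|\mathcal{S}_h(t)| - 1}{M} \right). \tag{32}$$

Thus, for a given choice of subset $\mathcal{S}_h(t)$ and under the assumption that the power $P_h(t)$ is equally shared across the user streams in $\mathcal{S}_h(t)$, the vector $\mathbf{C}_h(\mathcal{S}_h(t), t) = \{C_{hu}(\mathcal{S}_h(t), t)\}_{u \in \mathcal{N}(h)}$ of rates (in bits per channel symbol) achieved by all the users in $\mathcal{N}(h)$ is given by

$$C_{hu}(\mathcal{S}_h(t), t) = \begin{cases} \log\left( 1 + \dfrac{M - |\mathcal{S}_h(t)| + 1}{|\mathcal{S}_h(t)|} \cdot \dfrac{P_h g_{hu}(t)}{1 + \sum_{h' \neq h} P_{h'} g_{h'u}(t)} \right) & \text{if } u \in \mathcal{S}_h(t) \\ 0 & \text{if } u \notin \mathcal{S}_h(t). \end{cases} \tag{33}$$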

In fact, it is known that the asymptotics kick in very quickly, making the rates in (33) achievable for practical values of $M$ and $|\mathcal{S}_h(t)|$. Notice that the rate expression is independent of the small scale fading coefficients. This is because the large number of antennas $M$ at the helpers yields a large $M \times |\mathcal{S}_h(t)|$ random channel matrix $\mathbf{H}(t)$ of i.i.d. complex Gaussian small scale fading coefficients in every transmission slot $t$. When each helper performs LZFBF in every transmission slot $t$, the coefficients $\Lambda_u(t)$, given in (30) by the reciprocals of the diagonal elements of the inverse Wishart matrix $\left(\mathbf{H}^{\mathsf{H}}(t)\mathbf{H}(t)\right)^{-1}$, "harden" at a deterministic value (32) (see [30]) due to the large size of the matrix $\mathbf{H}(t)$ and the assumption $|\mathcal{S}_h(t)|/M < 1$. This results in deterministic rate expressions as in (33), which are independent of $\mathbf{H}(t)$ and depend only on the large scale pathloss coefficients $g_{hu}(t)$. Furthermore, in the case when the helpers are incapable of MU-MIMO, i.e., when the active user subset size $|\mathcal{S}_h(t)|$ is chosen to be exactly 1, the above formula still holds by setting $|\mathcal{S}_h(t)| = 1$, and this is referred to as single user MIMO (SU-MIMO).

Since helper $h$ can choose an active user subset from the collection of all possible user subsets of $\mathcal{N}(h)$, the vector $\{\mu_{hu}(t)\}_{u \in \mathcal{N}(h)}$ of bits scheduled by helper $h$ to users $u$ in its neighborhood $\mathcal{N}(h)$ is constrained to lie in the discrete set of vectors

$$\left\{ s\, \mathbf{C}_h(\mathcal{S}_h, t) : \mathcal{S}_h \subseteq \mathcal{N}(h) \right\} \tag{34}$$

where $s$ is the number of channel symbols available in every transmission slot $t$. Notice from the rate expression in (33) that the topology state $\omega(t)$ in this wireless system is given by the vector $\{g_{hu}(t)\}$ of large-scale pathloss coefficients between each helper-user pair $(h, u) \in \mathcal{E}$.

We assume that the receiver at every user is advanced in the sense that it can decode multiple streams in the same transmission slot, i.e., user $u$, in transmission slot $t$, can receive $\mu_u(t) = \sum_{h \in \mathcal{N}(u)} \mu_{hu}(t)$ video-encoded bits by simultaneously downloading $\mu_{hu}(t)$ bits from helpers $h$ in $\mathcal{N}(u)$. Notice that each stream is achievable (in an information theoretic sense) by treating the other streams as Gaussian noise, i.e., we do not make use of multiuser detection schemes (e.g., based on successive interference cancellation) at the user receivers. Therefore, our rate expressions are representative of what can be achieved with today's user device technology.

Footnote: The rate expression (33) neglects the effect of pilot contamination, which arises in massive MIMO with TDD and open-loop channel estimation based on uplink pilots and channel reciprocity. While in the regime of infinite number of base station antennas and finite number of users pilot contamination dominates the massive MIMO performance in a multi-cell network, it is well-known that in the more realistic regime of large but finite number of antennas this effect is typically negligible with respect to the residual multiuser inter-cell interference [20]. Here, for simplicity of exposition and space limitation, we neglect these effects and assume that the LZFBF precoder is computed from ideal knowledge of the channel matrix, such that our rate expressions are exact under this assumption. However, we hasten to say that our approach is immediately applicable to the case of imperfect channel state knowledge, by using the appropriate (more involved) feasible rate expressions.

For the sake of comparison, in the simulation results of Section VII we also consider a dumb receiver heuristic where each user $u$ decodes only the strongest data stream and therefore downloads only $\max_{h \in \mathcal{N}(u)} \mu_{hu}(t)$ video-encoded bits. While the dumb receiver heuristic is a degradation of the optimal solution involving advanced receivers, the simulation results in Section VII show that this degradation is almost negligible. This also implicitly indicates that, in most relevant practical topologies and pathloss scenarios, it is unlikely that the same user is scheduled by more than one helper in the same transmission slot.

B. Transmission Scheduling with Massive MU-MIMO Helpers 

We now particularize the problem (19) to the specific wireless system with massive MU-MIMO helpers. For the constraint (34) specific to the wireless system, the general weighted sum-rate maximization problem (19) reduces to:

$$\begin{aligned} \text{maximize} \quad & \sum_{h \in \mathcal{H}} \sum_{u \in \mathcal{N}(h)} Q_u(t) \mu_{hu}(t) \\ \text{subject to} \quad & \{\mu_{hu}(t)\}_{u \in \mathcal{N}(h)} \in \left\{ s\, \mathbf{C}_h(\mathcal{S}_h, t) : \mathcal{S}_h \subseteq \mathcal{N}(h) \right\} \quad \forall\, h \in \mathcal{H}. \end{aligned} \tag{35}$$

This problem decouples into separate maximizations for each helper $h$, given by the following discrete optimization problem:

$$\begin{aligned} \text{maximize} \quad & \sum_{u \in \mathcal{N}(h)} Q_u(t) \mu_{hu}(t) \\ \text{subject to} \quad & \{\mu_{hu}(t)\}_{u \in \mathcal{N}(h)} \in \left\{ s\, \mathbf{C}_h(\mathcal{S}_h, t) : \mathcal{S}_h \subseteq \mathcal{N}(h) \right\}. \end{aligned} \tag{36}$$

The above optimization problem at each helper $h$ essentially corresponds to maximizing the weighted sum rate over the discrete set of vectors $\{ s\, \mathbf{C}_h(\mathcal{S}_h, t) : \mathcal{S}_h \subseteq \mathcal{N}(h) \}$, with an exponential number $2^{|\mathcal{N}(h)|}$ of choices for the active user subset. However, the key observation from the rate expression (33) is that when helper $h$ schedules the subset $\mathcal{S}_h$ of users for MU-MIMO beamforming, the rate of each user $u \in \mathcal{S}_h$ depends only on the cardinality $|\mathcal{S}_h|$ but not on the identity of the members of the subset $\mathcal{S}_h$. This implies that for a fixed subset size $S$, the subset $\mathcal{U}_h^*(S, t)$ of users maximizing the weighted sum rate can be obtained by sorting the users in $\mathcal{N}(h)$ according to the weighted rate

$$Q_u(t) \log\left( 1 + \frac{M - S + 1}{S} \cdot \frac{P_h g_{hu}(t)}{1 + \sum_{h' \neq h} P_{h'} g_{h'u}(t)} \right)$$

and choosing greedily the best $S$ users. Thus, we have

$$\mathcal{U}_h^*(S, t) = \operatorname{argmax-}S \left\{ Q_u(t) \log\left( 1 + \frac{M - S + 1}{S} \cdot \frac{P_h g_{hu}(t)}{1 + \sum_{h' \neq h} P_{h'} g_{h'u}(t)} \right) \ \forall\, u \in \mathcal{N}(h) \right\}, \tag{37}$$

where $\operatorname{argmax-}S$ denotes the operation of choosing the first $S$ elements of a set of real numbers sorted in decreasing order.

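This sort & greedy selection procedure is repeated for every subset size, yielding the subsets $\mathcal{U}_h^*(S, t)$, one for each feasible subset size $S$. Then, from these subsets, the subset $\mathcal{U}_h^*(t)$ which has the maximum weighted sum rate is picked as

$$\mathcal{U}_h^*(t) = \arg\max \left\{ \sum_{u \in \mathcal{U}_h^*(S, t)} Q_u(t) \log\left( 1 + \frac{M - S + 1}{S} \cdot \frac{P_h g_{hu}(t)}{1 + \sum_{h' \neq h} P_{h'} g_{h'u}(t)} \right) : \mathcal{U}_h^*(S, t)\ \forall\, S \right\} \tag{38}$$

yielding the optimal solution to (36).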

A typical sorting algorithm has complexity $O(|\mathcal{N}(h)| \log |\mathcal{N}(h)|)$ and, since the sorting procedure is repeated for every subset size, the algorithm has complexity $O(|\mathcal{N}(h)|^2 \log |\mathcal{N}(h)|)$, which improves upon existing user scheduling algorithms for the MIMO broadcast channel.
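The sort & greedy selection at one helper can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' code: the function names and list-based inputs are hypothetical, the logarithm is taken in base 2 (the text speaks of bits per channel symbol but does not state the base), and the factor $s$ is omitted because it scales all candidates equally.

```python
import math


def zf_rate(M, S, p_g, interference):
    """Hardened LZFBF rate in the form of Eq. (33), base-2 logarithm assumed.

    p_g is P_h * g_hu and interference is the sum over other helpers h' of
    P_h' * g_h'u, both for one user (hypothetical argument names).
    """
    return math.log2(1.0 + (M - S + 1) / S * p_g / (1.0 + interference))


def select_users(M, queues, p_g, interference):
    """Sort-and-greedy solution of problem (36) for a single helper.

    queues[u], p_g[u] and interference[u] describe the users in N(h).
    Returns (subset as sorted user indices, weighted sum rate of that subset).
    """
    n = len(queues)
    best_subset, best_value = [], 0.0
    for S in range(1, min(M, n) + 1):
        # For a fixed size S the per-user rate does not depend on who else is served.
        weights = [queues[u] * zf_rate(M, S, p_g[u], interference[u]) for u in range(n)]
        top = sorted(range(n), key=lambda u: weights[u], reverse=True)[:S]
        value = sum(weights[u] for u in top)
        if value > best_value:
            best_subset, best_value = sorted(top), value
    return best_subset, best_value


if __name__ == "__main__":
    subset, value = select_users(
        M=3,
        queues=[5.0, 1.0, 3.0, 0.5],
        p_g=[10.0, 100.0, 1.0, 50.0],
        interference=[0.1, 0.2, 0.0, 1.0],
    )
    print(subset, value)
```

Because the rate for a fixed $S$ depends only on the subset size, taking the $S$ largest weighted rates is exact for that size, and the outer loop over $S$ then recovers the maximizer of (36) without enumerating all subsets.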


VI. Pre-buffering and re-buffering chunks 

As described in Section II, the playback process consumes chunks at a fixed playback rate $1/T_{\mathrm{gop}}$ (one chunk per video chunk slot $i$), while the number of chunks received per video chunk slot is a random variable due to the fact that the topology state $\omega(t)$ is a random process and the transmission resources are dynamically allocated by the DPP scheduling policy. In order to prevent stall events, each user $u$ should choose its pre-buffering time $T_u$ to be larger than the maximum delay with which a chunk is delivered to it. However, such maximum delay is neither deterministic nor known a priori. Moreover, even in special cases where the maximum delay of each request queue in the system admits a deterministic bound (e.g., see [25]), such a bound may be loose and setting the pre-buffering time to be larger than that bound might be simply unacceptable in a practical system implementation. We therefore follow the scheme in [13], where each user $u$ estimates its local delays by monitoring its delivery times in a sliding window spanning a fixed number of video chunk slots. However, the key difference from [13] is that the scheme in this paper is much simpler since the proposed pull congestion control scheme ensures that chunks are received in the right playback order.

The goal here is to determine the delay $T_u$ after which user $u$ should start playback, with respect to the time at which the first chunk is requested (beginning of the streaming session). We define the size $\Psi_u(i)$ of the playback buffer as the number of chunks available in the buffer at video chunk slot $i$ but not yet played out. Without loss of generality, assume that the streaming session starts at $i = 1$. Then, $\Psi_u(i)$ evolves at the video timescale over video chunk slots $i \in \{1, 2, 3, \ldots\}$ as

$$\Psi_u(i) = \max\left\{ \Psi_u(i-1) - 1\{i > T_u\},\ 0 \right\} + a_i, \tag{39}$$

where $a_i$ is the number of chunks which are completely downloaded in the transmission slots between $t = (i-1)n$ and $t = in$. Note that the playback buffer is updated every video chunk slot $i$, i.e., at the time scale of seconds. Thus, if the download of a chunk is completed between $t = (i-1)n$ and $t = in$, from the playback buffer's perspective, the chunk is considered to have arrived at the end of the $i$-th video chunk slot, i.e., at $t = in$. Let $A_k$ denote the video chunk slot in which chunk $k$ arrives at the user and let $W_k$ denote the delay (measured in video chunk slots) with which chunk $k$ is delivered. Note that the longest period during which $\Psi_u(i)$ is not incremented is given by the maximum delay to deliver chunks. Thus, each user $u$ needs to adaptively estimate $W_k$ in order to choose $T_u$. In the proposed method, at each video chunk slot $i = 1, 2, \ldots$, user $u$ calculates the maximum observed delay $\Xi_i$ in a sliding window of size $\Delta$, by letting:

$$\Xi_i = \max\left\{ W_k : i - \Delta + 1 \le A_k \le i \right\}. \tag{40}$$

Finally, user $u$ starts its playback when $\Psi_u(i)$ crosses the level $\rho \Xi_i$, i.e., $T_u = \min\{ i : \Psi_u(i) \ge \rho \Xi_i \}$, where $\rho$ is an algorithm control parameter. If a stall event occurs at video chunk slot $i'$, i.e., $\Psi_u(i') = 0$ for $i' > T_u$, the algorithm enters a re-buffering phase in which the same algorithm presented above is employed again to determine the new instant $i' + T_u + 1$ at which playback is restarted. With slight abuse of notation, we have re-used $T_u$ to denote the re-buffering delay although it is re-estimated using the sliding window method at each new stall event.
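The pre-buffering rule can be made concrete with a short Python sketch. It is an illustration only: the function name and inputs are hypothetical, it covers only the initial pre-buffering phase (before any chunk is played out), and it assumes that chunk $k$ is requested in video chunk slot $k$, so that its delay is the arrival slot minus $k$.

```python
def prebuffer_start(arrival_slot, window, rho, horizon):
    """Return the first slot i at which the buffer level reaches rho * Xi_i.

    arrival_slot[k-1] is the video chunk slot in which chunk k arrives.
    Assumption for this sketch: chunk k is requested in slot k, so its
    delivery delay is arrival_slot[k-1] - k.
    Returns None if the threshold is not reached within `horizon` slots.
    """
    buffered = 0  # playback buffer level; nothing is played out before start
    for i in range(1, horizon + 1):
        # chunks whose download completes in slot i
        buffered += sum(1 for a in arrival_slot if a == i)
        # largest delay among chunks that arrived inside the sliding window
        delays = [a - k for k, a in enumerate(arrival_slot, start=1)
                  if i - window + 1 <= a <= i]
        if delays and buffered >= rho * max(delays):
            return i
    return None


if __name__ == "__main__":
    arrivals = [3, 4, 4, 6, 7, 7, 8]
    for rho in (1, 2, 3):
        print(rho, prebuffer_start(arrivals, window=5, rho=rho, horizon=10))
```

A larger control parameter makes the user wait for more buffered chunks relative to the worst recently observed delay, trading start-up latency for a lower risk of stalls.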


VII. Numerical Experiment 

Our simulations are based on a network topology formed by an 80 m x 80 m region with 5 helpers (indicated by o's) as shown in Fig. 1. The users (indicated by *'s) are generated according to a non-homogeneous Poisson point process with higher density in a central region of size ^mx^m, as shown in Fig. 1.

Each helper has $M$ antennas and serves user sets of size up to $S$, with transmission power of 35 dBm. The pathloss from a helper to a user is given by ^ , with $d$ representing the helper-user distance (assuming a torus wrap-around model to avoid boundary effects). We assume a PHY frame duration of 10 ms and a total system bandwidth of 18 MHz as specified in the LTE 4G standard. With one OFDM resource block (7 x 12 channel symbols) spanning 0.5 ms in time and 180 kHz in bandwidth (corresponding to 12 adjacent subcarriers each with 15 kHz bandwidth), each transmission slot spans $s = 84 \times 100 \times 20$ channel symbols.

Footnote: $1\{\mathcal{K}\}$ denotes the indicator function of a condition or event $\mathcal{K}$.

Fig. 1: Simulation setup
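The count of channel symbols per transmission slot follows directly from the quoted numerology, as the short Python check below shows. It assumes that one transmission slot coincides with one 10 ms PHY frame, which is how the product in the text reads; the variable names are illustrative.

```python
# Numerology quoted in the text (LTE-like)
frame_ms, bandwidth_khz = 10.0, 18_000.0
rb_ms, rb_khz = 0.5, 180.0
symbols_per_rb = 7 * 12                              # 84 channel symbols per resource block

rbs_in_frequency = round(bandwidth_khz / rb_khz)     # resource blocks across 18 MHz
rbs_in_time = round(frame_ms / rb_ms)                # resource blocks across one 10 ms frame
s = symbols_per_rb * rbs_in_frequency * rbs_in_time  # channel symbols per transmission slot

print(rbs_in_frequency, rbs_in_time, s)  # 100 20 168000
```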

We assume that all the users request chunks successively from VBR-encoded video sequences. Each video file is a long sequence of chunks, each of duration 0.5 seconds and with a frame rate of 30 frames per second. We consider a specific video sequence formed by 800 chunks, constructed using several standard video clips from the database in [32]. The chunks are encoded into different quality modes with the quality index measured using the Structural SIMilarity (SSIM) index. The chunks from 1 to 200 are encoded into 8 quality modes with an average bitrate of 631 kbps. Chunks 201 to 400 are encoded in 4 quality modes at an average bitrate of 3908 kbps. Similarly, chunks 401-600 and 601-800 are encoded into 4 and 8 quality modes with average bitrates of 6679 kbps and 556 kbps respectively. In the simulation, each user starts its streaming session of 1000 chunks from some arbitrary position in this reference video sequence and successively requests 1000 chunks by cycling through the sequence.

We choose the utility function $\phi_u(\cdot) = \log(\cdot)\ \forall\, u \in \mathcal{U}$ to impose proportional fairness. We set the pre-buffering algorithm control parameter (described in Section VI) to $\rho = 3$. We simulate our algorithm for the layout shown in Fig. 1 (with around 500 users generated according to a non-homogeneous Poisson point process as explained above). At $t = 1$, all the users simultaneously start streaming 1000 chunks.

We studied the performance of our algorithm with $M = 40$ antennas and maximum active user subset size $S = 10$ for different values of the policy control parameter $V$ and observed that both QoE metrics, the average video quality and the % of time spent in buffering mode, are satisfactory for the choice of V = 2* 10^“^. We use that value for the rest of the simulations in this section.

Fig. 2: Performance comparison of advanced and dumb receivers (panels (a)-(c)).

We now study the performance loss experienced under the dumb receiver heuristic, where the receiver at every user $u$ decodes only the strongest signal and downloads only $\max_{h \in \mathcal{N}(u)} \mu_{hu}(t)$ bits, in contrast to the macro diversity advanced receiver, which can decode multiple signals simultaneously and download all the $\sum_{h \in \mathcal{N}(u)} \mu_{hu}(t)$ bits. Using $M = 40$ and $S = 10$, we simulate our algorithm and plot the CDFs over the user population of a) the average video quality, b) the average delay in the reception of video chunks measured in video chunk slots and c) the % of playback time spent in buffering mode in Figs. 2a, 2b and 2c respectively. We observe that the performance loss in using a dumb receiver is fairly negligible and therefore use a dumb receiver for the rest of the simulations in this section.

We next study the QoE improvement when MU-MIMO is deployed at the helpers in place of legacy SU-MIMO systems. We plot the CDF over the user population of the same video streaming QoE metrics as in the previous figures for three different cases: 1) MU-MIMO with $M = 40$ antennas and maximum active user subset size $S = 10$; 2) MU-MIMO with $M = 20$ antennas and maximum active user subset size $S = 5$; 3) SU-MIMO with $M = 10$ antennas. From Figs. 3a, 3b and 3c, we can observe that there is significant improvement of video streaming performance in terms of the average video quality and the average delay (or alternately the percentage of time spent in buffering mode) when MU-MIMO is employed at the PHY layer in comparison to SU-MIMO. This clearly indicates that upgrading current SU-MIMO systems to massive MU-MIMO is a promising approach to meet the increasing demands for HD video streaming.

Fig. 3: Video streaming QoE improvement with MU-MIMO over SU-MIMO (empirical CDFs, panels (a)-(c)).

Finally, we study the benefits of using a cross layer approach in comparison to a baseline scheme representative of legacy wireless systems. We perform this comparison for the case where every helper employs SU-MIMO with $M = 10$ antennas. For the baseline scheme, every user first fixes its association with the unique helper that provides the maximum received signal strength (RSSI) $P_h g_{hu}$ and then uses the same control decision (16) to choose the quality levels for the chunks that arrive into the request queue every video chunk slot. Furthermore, we assume that the helpers locally employ proportional fairness/equal air-time scheduling, i.e., each helper $h$ schedules the users associated with it through the max-RSSI scheme in a round-robin fashion across the transmission slots, independent of the request queue lengths at the users. This baseline scheme is representative of current practical systems where the decisions across different layers are independent and there is no interaction between the upper and lower layers. We plot the CDFs over the user population of the average video quality and the average delay in the reception of chunks in Figs. 4a and 4b respectively. We can observe that the cross layer scheme treats the users in a fair manner while the baseline scheme favors some users at the expense of other users in the system.

Fig. 4: Performance comparison of a cross-layer approach with a baseline scheme (panels (a), (b)).
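The baseline used in this comparison, max-RSSI association followed by queue-agnostic round-robin scheduling, can be sketched as follows. The function names and the nested-list layout `rssi[h][u]` (standing for $P_h g_{hu}$) are assumptions made for this illustration, and ties are broken in favor of the lowest helper index.

```python
def max_rssi_association(rssi):
    """Associate each user with the helper of maximum received signal strength.

    rssi[h][u] stands for P_h * g_hu (hypothetical layout). Returns a list
    whose entry u is the index of the helper chosen by user u.
    """
    n_helpers, n_users = len(rssi), len(rssi[0])
    return [max(range(n_helpers), key=lambda h: rssi[h][u]) for u in range(n_users)]


def round_robin_schedule(association, n_helpers, t):
    """User served by each helper in transmission slot t, ignoring queue lengths.

    Returns one entry per helper: the scheduled user index, or None if no
    user is associated with that helper.
    """
    served = []
    for h in range(n_helpers):
        users = [u for u, a in enumerate(association) if a == h]
        served.append(users[t % len(users)] if users else None)
    return served


if __name__ == "__main__":
    rssi = [[5.0, 1.0, 3.0, 2.0],
            [2.0, 4.0, 3.5, 1.0]]
    assoc = max_rssi_association(rssi)
    print(assoc)
    print([round_robin_schedule(assoc, 2, t) for t in range(3)])
```

In contrast to the max-weight rule of (36), nothing in this schedule depends on the request queues, which is the lack of cross-layer interaction that the comparison is meant to expose.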


Appendix A 

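Proof of Theorem 1 and of Corollary 1

As in Section III, we consider the following problem, equivalent to (20)-(23), which involves a sum of time-averages instead of functions of time averages and introduces the auxiliary variables $\gamma_u(\tau)$:

$$\begin{aligned}
\text{maximize} \quad & \frac{1}{T} \sum_{\tau = jT}^{(j+1)T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) && \text{(41)} \\
\text{subject to} \quad & \frac{1}{T} \sum_{\tau = jT}^{(j+1)T-1} \left[ B_{f_u}(m_u(\tau), \tau) - \mu_u(\tau) \right] \le 0 \quad \forall\, u \in \mathcal{U} && \text{(42)} \\
& \frac{1}{T} \sum_{\tau = jT}^{(j+1)T-1} \left[ \gamma_u(\tau) - D_{f_u}(m_u(\tau), \tau) \right] \le 0 \quad \forall\, u \in \mathcal{U} && \text{(43)} \\
& D_u^{\min} \le \gamma_u(\tau) \le D_u^{\max} \quad \forall\, u \in \mathcal{U},\ \forall\, \tau \in \{jT, \ldots, (j+1)T-1\} && \text{(44)} \\
& [\mu_{hu}(\tau)] \in \mathcal{R}(\omega(\tau)) \quad \forall\, \tau \in \{jT, \ldots, (j+1)T-1\} && \text{(45)} \\
& m_u(\tau) \in \{1, 2, \ldots, N_{f_u}\} \quad \forall\, u \in \mathcal{U},\ \forall\, \tau \in \{jT, \ldots, (j+1)T-1\} && \text{(46)}
\end{aligned}$$

The request queues $Q_u$ and the virtual queues $\Theta_u$, for all $u \in \mathcal{U}$, evolve according to the update equations given earlier. Let $\mathbf{G}(\tau) = \left[ \mathbf{Q}^{\mathsf{T}}(\tau), \boldsymbol{\Theta}^{\mathsf{T}}(\tau) \right]^{\mathsf{T}}$ be the combined queue backlogs column vector, and define the quadratic Lyapunov function $L(\mathbf{G}(\tau)) = \frac{1}{2} \mathbf{G}^{\mathsf{T}}(\tau) \mathbf{G}(\tau)$. Fix a particular slot $\tau$ in the $j$-th frame. We first consider the one-slot drift of $L(\mathbf{G}(\tau))$. From (13), we know that

$$L(\mathbf{G}(\tau+1)) - L(\mathbf{G}(\tau)) \le \mathcal{K} + (\mathbf{B}(\tau) - \boldsymbol{\mu}(\tau))^{\mathsf{T}} \mathbf{Q}(\tau) + (\boldsymbol{\gamma}(\tau) - \mathbf{D}(\tau))^{\mathsf{T}} \boldsymbol{\Theta}(\tau) \tag{47}$$

where $\mathcal{K}$ is a uniform bound on the term

$$\frac{1}{2} \left[ (\mathbf{B}(\tau) - \boldsymbol{\mu}(\tau))^{\mathsf{T}} (\mathbf{B}(\tau) - \boldsymbol{\mu}(\tau)) + (\boldsymbol{\gamma}(\tau) - \mathbf{D}(\tau))^{\mathsf{T}} (\boldsymbol{\gamma}(\tau) - \mathbf{D}(\tau)) \right],$$

which exists under the realistic assumption that the chunk sizes, the transmission rates and the video quality measures are upper bounded by some constants, independent of $\tau$. We choose $\mathcal{K}$ such that

$$\mathcal{K} \ge 2 \boldsymbol{\kappa}^{\mathsf{T}} \boldsymbol{\kappa} \tag{48}$$

where $\boldsymbol{\kappa}$ is a vector whose components are all equal to the same number $\kappa$, and this number is a uniform upper bound on the maximum possible magnitude of drift in any of the queues (both the request queues $Q_u$ and the virtual queues $\Theta_u$) in one slot. With the additional penalty term $-V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau))$ added on both sides of (47), we have the following DPP inequality:

$$L(\mathbf{G}(\tau+1)) - L(\mathbf{G}(\tau)) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le \mathcal{K} + (\mathbf{B}(\tau) - \boldsymbol{\mu}(\tau))^{\mathsf{T}} \mathbf{Q}(\tau) + (\boldsymbol{\gamma}(\tau) - \mathbf{D}(\tau))^{\mathsf{T}} \boldsymbol{\Theta}(\tau) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \tag{49}$$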

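Let the DPP policy, which minimizes the right hand side of the drift-plus-penalty inequality (49), comprise the control actions $\{m_u(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, $\{\gamma_u(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, and $\{[\mu_{hu}(\tau)]\}_{\tau=jT}^{(j+1)T-1}$. Since the DPP policy minimizes the expression on the RHS of (49), any other policy comprising the control actions $\{m_u^*(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, $\{\gamma_u^*(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, and $\{[\mu_{hu}^*(\tau)]\}_{\tau=jT}^{(j+1)T-1}$ would give a larger value of the expression. We therefore have

$$L(\mathbf{G}(\tau+1)) - L(\mathbf{G}(\tau)) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le \mathcal{K} + (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau))^{\mathsf{T}} \mathbf{Q}(\tau) + (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau))^{\mathsf{T}} \boldsymbol{\Theta}(\tau) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)). \tag{50}$$

Further, we note that the maximum change in the queue lengths $Q_u(\tau)$ and $\Theta_u(\tau)$ from one slot to the next is bounded by $\kappa$. Thus, we have for all $\tau \in \{jT, \ldots, (j+1)T-1\}$

$$|Q_u(\tau) - Q_u(jT)| \le (\tau - jT)\kappa \quad \forall\, u \in \mathcal{U} \tag{51}$$

$$|\Theta_u(\tau) - \Theta_u(jT)| \le (\tau - jT)\kappa \quad \forall\, u \in \mathcal{U} \tag{52}$$

Substituting the above inequalities in (50), we have

$$L(\mathbf{G}(\tau+1)) - L(\mathbf{G}(\tau)) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le \mathcal{K} + (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau))^{\mathsf{T}} \left( \mathbf{Q}(jT) + (\tau - jT)\boldsymbol{\kappa} \right) + (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau))^{\mathsf{T}} \left( \boldsymbol{\Theta}(jT) + (\tau - jT)\boldsymbol{\kappa} \right) - V \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)). \tag{53}$$

Then, summing (53) over $\tau \in \{jT, \ldots, (j+1)T-1\}$, we obtain the $T$-slot Lyapunov drift over the $j$-th frame:

$$\begin{aligned}
L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le\ & \mathcal{K}T + \left( \sum_{\tau=jT}^{jT+T-1} (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau)) \right)^{\mathsf{T}} \mathbf{Q}(jT) + \left( \sum_{\tau=jT}^{jT+T-1} (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau))(\tau - jT) \right)^{\mathsf{T}} \boldsymbol{\kappa} \\
& + \left( \sum_{\tau=jT}^{jT+T-1} (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau)) \right)^{\mathsf{T}} \boldsymbol{\Theta}(jT) + \left( \sum_{\tau=jT}^{jT+T-1} (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau))(\tau - jT) \right)^{\mathsf{T}} \boldsymbol{\kappa} \\
& - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)).
\end{aligned} \tag{54}$$

Using the inequalities $\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau) \le 2\boldsymbol{\kappa}$, $\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau) \le 2\boldsymbol{\kappa}$ in (54), we have

$$\begin{aligned}
L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le\ & \mathcal{K}T + \left( \sum_{\tau=jT}^{jT+T-1} (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau)) \right)^{\mathsf{T}} \mathbf{Q}(jT) + 2 \left( \sum_{\tau=jT}^{jT+T-1} (\tau - jT) \right) \boldsymbol{\kappa}^{\mathsf{T}} \boldsymbol{\kappa} \\
& + \left( \sum_{\tau=jT}^{jT+T-1} (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau)) \right)^{\mathsf{T}} \boldsymbol{\Theta}(jT) + 2 \left( \sum_{\tau=jT}^{jT+T-1} (\tau - jT) \right) \boldsymbol{\kappa}^{\mathsf{T}} \boldsymbol{\kappa} \\
& - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)).
\end{aligned} \tag{55}$$

Using $\boldsymbol{\kappa}^{\mathsf{T}} \boldsymbol{\kappa} \le \mathcal{K}/2$ and $\sum_{\tau=jT}^{jT+T-1} (\tau - jT) = \frac{T(T-1)}{2}$, we get

$$\begin{aligned}
L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le\ & \mathcal{K}T + \mathcal{K}T(T-1) + \left( \sum_{\tau=jT}^{jT+T-1} (\mathbf{B}^*(\tau) - \boldsymbol{\mu}^*(\tau)) \right)^{\mathsf{T}} \mathbf{Q}(jT) \\
& + \left( \sum_{\tau=jT}^{jT+T-1} (\boldsymbol{\gamma}^*(\tau) - \mathbf{D}^*(\tau)) \right)^{\mathsf{T}} \boldsymbol{\Theta}(jT) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)).
\end{aligned} \tag{56}$$

We now consider the policy comprising the control actions $\{m_u^*(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, $\{\gamma_u^*(\tau)\}_{\tau=jT}^{(j+1)T-1}\ \forall\, u \in \mathcal{U}$, and $\{[\mu_{hu}^*(\tau)]\}_{\tau=jT}^{(j+1)T-1}$, and satisfying the following constraints:

$$\frac{1}{T} \sum_{\tau=jT}^{(j+1)T-1} \left[ B_{f_u}(m_u^*(\tau), \tau) - \mu_u^*(\tau) \right] < -\epsilon \quad \forall\, u \in \mathcal{U} \tag{57}$$

$$\frac{1}{T} \sum_{\tau=jT}^{(j+1)T-1} \left[ \gamma_u^*(\tau) - D_{f_u}(m_u^*(\tau), \tau) \right] < -\epsilon \quad \forall\, u \in \mathcal{U} \tag{58}$$

where $\epsilon > 0$ is arbitrary. We plug the inequalities (57), (58) into (56) and obtain

$$L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) < \mathcal{K}T^2 - \epsilon T \sum_{u \in \mathcal{U}} Q_u(jT) - \epsilon T \sum_{u \in \mathcal{U}} \Theta_u(jT) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u^*(\tau)). \tag{59}$$

Also, considering the fact that for any vector $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_{|\mathcal{U}|})$ we have

$$\sum_{u \in \mathcal{U}} \phi_u(D_u^{\min}) = \phi_{\min} \le \sum_{u \in \mathcal{U}} \phi_u(\gamma_u) \le \phi_{\max} = \sum_{u \in \mathcal{U}} \phi_u(D_u^{\max}), \tag{60}$$

we can write: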


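$$L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) < \mathcal{K}T^2 + VT(\phi_{\max} - \phi_{\min}) - \epsilon T \sum_{u \in \mathcal{U}} Q_u(jT) - \epsilon T \sum_{u \in \mathcal{U}} \Theta_u(jT). \tag{61}$$

Once again using (51), (52), we have:

$$L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) < \mathcal{K}T^2 + VT(\phi_{\max} - \phi_{\min}) - \epsilon \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} Q_u(\tau) - \epsilon \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \Theta_u(\tau) + \epsilon \kappa |\mathcal{U}| T(T-1). \tag{62}$$

Summing the above over the frames $j \in \{0, \ldots, F-1\}$ yields

$$L(\mathbf{G}(FT)) - L(\mathbf{G}(0)) < \mathcal{K}T^2 F + VFT(\phi_{\max} - \phi_{\min}) - \epsilon \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} Q_u(\tau) - \epsilon \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} \Theta_u(\tau) + \epsilon \kappa |\mathcal{U}| FT(T-1). \tag{63}$$

Rearranging and neglecting appropriate terms, we get

$$\frac{1}{FT} \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} Q_u(\tau) + \frac{1}{FT} \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} \Theta_u(\tau) \le \frac{\mathcal{K}T}{\epsilon} + \frac{V(\phi_{\max} - \phi_{\min})}{\epsilon} + \frac{L(\mathbf{G}(0))}{\epsilon FT} + \kappa |\mathcal{U}| (T-1). \tag{64}$$

Taking limits as $F \to \infty$,

$$\lim_{F \to \infty} \frac{1}{FT} \sum_{\tau=0}^{FT-1} \left( \sum_{u \in \mathcal{U}} Q_u(\tau) + \sum_{u \in \mathcal{U}} \Theta_u(\tau) \right) \le \frac{\mathcal{K}T}{\epsilon} + \frac{V(\phi_{\max} - \phi_{\min})}{\epsilon} + \kappa |\mathcal{U}| (T-1) \tag{65}$$

such that (25) is proved.

Footnote: It is easy to see that a policy satisfying (57)-(58) is guaranteed to exist provided that we allow, without loss of generality, for a virtual video layer of zero quality and zero rate, and under the assumption that, at any slot $t$, each user $u$ has at least one link $(h, u) \in \mathcal{E}$ with $h \in \mathcal{N}(u) \cap \mathcal{H}(f_u)$ with peak rate lower bounded by some strictly positive number $C_{\min}$. This prevents the case where a user gets zero rate for a whole frame of length $T$. This assumption is not restrictive in practice since a user experiencing unacceptably poor link quality to all the helpers for a long time interval would be disconnected from the network and its streaming session halted.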

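We now consider the policy comprising the decisions $\{m_u^*(\tau)\}$, $\{\gamma_u^*(\tau)\}$ and $\{[\mu_{hu}^*(\tau)]\}$ which achieves the optimal solution to the problem (41)-(46). Using (42) and (43) in (56), we have

$$L(\mathbf{G}((j+1)T)) - L(\mathbf{G}(jT)) - V \sum_{\tau=jT}^{jT+T-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le \mathcal{K}T + \mathcal{K}T(T-1) - VT \phi_j^{\mathrm{opt}}. \tag{66}$$

Summing this over $j \in \{0, \ldots, F-1\}$ yields

$$L(\mathbf{G}(FT)) - L(\mathbf{G}(0)) - V \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \le \mathcal{K}T^2 F - VT \sum_{j=0}^{F-1} \phi_j^{\mathrm{opt}}. \tag{67}$$

Dividing both sides by $VFT$ and using the fact that $L(\mathbf{G}(FT)) \ge 0$, we get

$$\frac{1}{FT} \sum_{\tau=0}^{FT-1} \sum_{u \in \mathcal{U}} \phi_u(\gamma_u(\tau)) \ge \frac{1}{F} \sum_{j=0}^{F-1} \phi_j^{\mathrm{opt}} - \frac{\mathcal{K}T}{V} - \frac{L(\mathbf{G}(0))}{VTF}. \tag{68}$$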


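At this point, using Jensen's inequality, the fact that $\phi_u(\cdot)$ is continuous and non-decreasing for all $u \in \mathcal{U}$, and the fact that the strong stability of the queues (65) implies rate stability of the virtual queues $\Theta_u$ for all $u \in \mathcal{U}$, which in turn implies that $\overline{\gamma}_u \le \overline{D}_u\ \forall\, u \in \mathcal{U}$, we arrive at

$$\sum_{u \in \mathcal{U}} \phi_u\left(\overline{D}_u\right) \ge \lim_{F \to \infty} \frac{1}{F} \sum_{j=0}^{F-1} \phi_j^{\mathrm{opt}} - \frac{\mathcal{K}T}{V} \tag{69}$$

such that (24) is proved.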

Thus, the utility under the DPP policy is within $O(1/V)$ of the time average of the $\phi_j^{\mathrm{opt}}$ utility values, which can be achieved only with knowledge of the future states up to a look-ahead of blocks of $T$ slots. If $T$ is increased, then the value of $\phi_j^{\mathrm{opt}}$ for every frame $j$ improves since we allow a larger look-ahead. However, from (69), we can see that if $T$ is increased, then $V$ can also be increased in order to maintain the same distance from optimality. This yields a corresponding $O(V)$ increase in the queue backlogs.

For the case where the rate function $B_f(m, t)$, the quality function $D_f(m, t)$ and the topology state $\omega(t)$ are stationary and ergodic, the time averages in the left hand side of (65) and in the right hand side of (69) become ensemble averages because of ergodicity. Thus, we obtain (26) and (27). Furthermore, if the network state is i.i.d., we can take $T = 1$ in the above derivation, obtaining the bounds given in Corollary 1.
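As a quick consistency check of the i.i.d. specialization, one can substitute $T = 1$ into the two bounds of this appendix. The worked step below is a sketch that assumes the forms of (65) and (69) as derived above, in particular that the last term of (65) is proportional to $T - 1$ and therefore vanishes, and it reuses the constants $\mathcal{K}$, $\epsilon$, $\phi_{\max}$ and $\phi_{\min}$ as defined there:

```latex
\begin{align*}
\text{(69) with } T = 1: \quad
  & \sum_{u \in \mathcal{U}} \phi_u\left(\overline{D}_u\right)
    \ge \lim_{F \to \infty} \frac{1}{F} \sum_{j=0}^{F-1} \phi_j^{\mathrm{opt}} - \frac{\mathcal{K}}{V}, \\
\text{(65) with } T = 1: \quad
  & \lim_{F \to \infty} \frac{1}{F} \sum_{\tau=0}^{F-1}
    \left( \sum_{u \in \mathcal{U}} Q_u(\tau) + \sum_{u \in \mathcal{U}} \Theta_u(\tau) \right)
    \le \frac{\mathcal{K} + V(\phi_{\max} - \phi_{\min})}{\epsilon}.
\end{align*}
```

Read together, the two lines show the usual trade-off of the drift-plus-penalty method: the utility gap shrinks as $1/V$ while the backlog bound grows linearly in $V$.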
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